Towards A New Generation Of Terminological Resources: An Experiment In Building A Terminological Knowledge Base
Abstract
This paper describes a project to construct a terminological knowledge base, called COGNITERM. First, we position our research framework in relation to recent developments in computational lexicology and knowledge engineering. Second, we describe the COGNITERM prototype and discuss its advantages over conventional term banks. Finally, we outline some of the methodological issues that have emerged from our work.

0 INTRODUCTION

The discipline of terminology¹ has received surprisingly little focussed attention in the literature of computational linguistics, an unfortunate situation given that NLP systems seem to be most successful when applied to specialized domains. We say focussed attention because when specialized lexical items are discussed in the literature, the research problems are often not clearly differentiated from the problems of non-specialized lexical items. A fundamental assumption of our research is that, while terminology can certainly benefit from advances in computational lexicology, it nonetheless has its own non-trivial research problems, which are ultimately related to the quantity and types of specialized world knowledge that terminological repositories must contain.

At the Artificial Intelligence Laboratory of the University of Ottawa, we are constructing a new type of terminological repository, COGNITERM, which is essentially a hybrid between a term bank and a knowledge base, or a terminological knowledge base (TKB). COGNITERM is a bilingual (French/English) TKB constructed using a generic knowledge engineering tool (CODE) that has been used in terminology, software engineering and database design applications. The COGNITERM Project (1991-94) is focussing on the domain of optical storage technologies (e.g. optical discs, drives and processes).

In Section 1 of the paper, we position our research in relation to recent developments in computational lexicology and knowledge engineering; in Section 2, we describe the structure of COGNITERM as well as some of its advantages over conventional term banks; in Section 3, we outline some methodological issues that have emerged from our work.

¹ Space constraints preclude even a brief description of the discipline of terminology. Cf. Sager 1990.

1 RESEARCH ISSUES IN COMPUTATIONAL TERMINOLOGY

1.1 Terminological vs. Lexical Knowledge Bases

Much of the world's terminological data is stored in large terminological databases (TDBs) such as Canada's TERMIUM III, which contains over one million bilingual records. These TDBs are useful only to humans, and even then to only a small subset of potential users: translators remain the principal user category, even though TDBs have obvious applications in technical writing, management information and domain learning, not to mention a wide variety of machine uses such as information retrieval, machine translation and expert systems. A major weakness of TDBs is that they provide mainly linguistic information about terms (e.g. equivalents in other languages, morphological information, style labels); conceptual information is sparse (limited to definitions and sometimes contexts), unstructured, inconsistent and implicit.

Given these problems, a growing number of terminology researchers are calling for the evolution of TDBs into a new generation of terminological repositories that are knowledge-based. Since this vision of a TKB has recently been paralleled in computational lexicology by the vision of a lexical knowledge base or LKB (e.g. Atkins 1991, Boguraev and Levin 1990, Pustejovsky and Bergler 1991), we would like to briefly position our research framework in relation to these developments.

The LKB projected by Boguraev and Levin 1990 differs from a lexical database (LDB) in two ways: 1) the LDB states lexical characteristics on a word-by-word basis, while the LKB permits generalizations; and 2) the LKB permits inferencing, and thus the possibility of dynamically extending the lexicon to accommodate new senses. Both characteristics are extremely important for the TKB as well: 1) a capacity for supporting generalisations is particularly relevant to terminology since terminological repositories have an important teaching function; and 2) the accommodation of new senses is even more crucial to terminology than to the general lexicon since specialized languages grow so rapidly.

While the TKB must share these characteristics, it differs from the LKB in one important way, which derives from the fundamental difference between general and specialized lexical items. This difference can be summarized in the following two principles:

• an LKB must make explicit what a native speaker knows about concepts denoted by general lexical

Proc. of COLING-92, Nantes, Aug. 23-28, 1992, p. 956
Similar resources
Integrating Textual Knowledge and Formal Knowledge for Improving Traceability (Short Paper)
This article deals with traceability in knowledge repositories. More precisely, we concentrate on the role of terminological knowledge in the mapping between (informal) textual requirements and (formal) object models. We show that terminological knowledge facilitates the production of traceability links and model generation, provided that language processing technologies allow to elaborate semi...
Building a Terminological Database from Heterogeneous Definitional Sources
An obstacle to understanding results across heterogeneous databases is the ability to determine conceptual connections between differing terminologies. In this paper, we present the two step approach which we have used to build a terminological database in order to address this issue. First we automatically built a heterogeneous collection of terms and definitions from two types of dynamic sour...
Building a Large Knowledge Base Semi-Automatically
We describe a knowledge engineering approach by which conceptual knowledge is extracted from an informal, semantically weak medical thesaurus (UMLS) and automatically converted into a formally sound description logics system. Our approach consists of four steps: concept definitions are automatically generated from the UMLS source, integrity checking of taxonomic and partonomic hierarchies is per...
An Approach to Representation and Extraction of Terminological Knowledge in ICALL
This paper addresses an innovative approach to computer assisted learning of foreign language terminology which involves supporting not only foreign language learning focused on specific terminology but also the enhancement of conceptual knowledge in the subject area. ITELS, an intelligent tutoring system aimed at helping Bulgarians to learn English terminology in a particular subject area, exemp...
Terminological Knowledge Structure for Intermediary Expert Systems
An intermediary expert system (IES) helps both end users and professional searchers to conduct their online database searching. To provide advice about term selection and query expansion, an IES should include a terminological knowledge structure. Terminological attributes as well as other properties could provide the starting point for building a knowledge base, and knowledge acquisition could...
Publication date: 1992